International Journal of Engineering and Technical Research (IJETR) 
ISSN: 2321-0869, Volume-3, Issue-5, May 2015 


Evaluation of Spark Ignition Engine Emission 
Logistic Regression Model 

L.K. Langat, W.O. Ogola, J.K. Korir, D.K Chirchir 


Abstract — The transport sector is expected to be responsible 
for about 75% of carbon emission by the year 2020 and 
therefore reducing transport sector carbon emissions will be 
crucial for stabilizing atmospheric concentrations of greenhouse 
gases (US EPA, 2002). The research therefore sought to 
establish the major contributing factors and develop a logistic 
regression model based on vehicle category, usage and engine 
operating parameters. The sample comprised 384 randomly 
selected petrol vehicles. The key observations included vehicle usage, 
compression pressure, ignition angle, engine speed, spark plug 
gap, and vehicle category. The key variables examined for 
emission were CO, HC, CO2, excess air factor (lambda) and 
factors that influence emissions. A logistic regression model was 
fitted to determine the probability of tested vehicles failing 
emission tests based on the test variables. The mean vehicle 
usage ranged between 14328 km/yr and 19640 km/yr and the 
lowest compression pressure, 6.8 bar was recorded in the 
non-catalytic vehicles manufactured before 1986. Both 
categories of non-catalytic vehicles operated at rich mixture. 
The logistic regression model showed that the coefficients of the 
engine parameters, namely vehicle usage, compression pressure, 
ignition angle, engine speed, spark plug gap and lambda, were 
statistically significant in contributing to the probability of 
a vehicle failing or passing the test. The null hypothesis of no 
significant regression was strongly rejected for all categories of 
vehicles at 5 % significance level. 


Index Terms — Vehicle emissions, emissions factors, 
regression model 

I. INTRODUCTION 

Vehicle emissions, which occur near ground level and in 
densely populated areas, cause much greater human exposure 
to harmful pollutants in the immediate locality than do 
emissions from sources such as power plants that are situated 
at elevated levels and farther away from densely populated 
centres. In addition, vehicle exhaust particles, being small and 
numerous, can be expected to have considerable health 
impacts. Pollution abatement in the transport sector is 
therefore becoming a more important factor in urban air quality 
management strategies (Kojima and Lovei, 2001; Gwilliam et 
al., 2004). Real-world vehicle emissions are highly variable. 
Several factors account for the variability in emissions in 
different vehicles and the amount of environmental damage 
caused (Shehata and Razek, 2008). However, due to relatively 
higher average temperatures, poor fuel quality, poor vehicle 
maintenance culture and a high proportion of old vehicles, the 
level of emissions from mobile sources is usually high 
(Subramanian et al., 2007; Kojima and Lovei, 2001; Mulaku 
and Karuiki, 2001; Whitelegg and Hag, 2003; Bin, 2003; 
Choo et al., 2007). 

As vehicles age and accumulate mileage, their emissions 
tend to increase. This is both a function of normal degradation 
of emissions controls of properly functioning vehicles, 
resulting in moderate emissions increases, and malfunction or 
outright failure of emissions controls on some vehicles, 
possibly resulting in very large increases in emissions, 
particularly CO and HC (Wenzel, 1999; Washburn et al, 
2001; Bin, 2003). However, exhaust emission as a function of 
age and mileage accumulation can vary depending on vehicle 
maintenance culture. A more accurate way of determining the 
influence of the two variables in exhaust emission is by 
considering vehicle annual usage. This is obtained by 
dividing total mileage accumulated by vehicle age (US EPA, 
2002; BAQ, 2002). 

A. Vehicle models 

Some vehicle models are simply designed and 
manufactured better than others. Some vehicle models and 
engine families are observed to have very low average 
emissions while others exhibit very high rates of emissions 
control failure (Caton, 2003; Bureau of Automotive Repair, 
2003). The design of a particular emissions control system 
affects both the initial effectiveness and the lifetime durability 
of the system, which in turn contributes to a model-specific 
emissions rate (Fomunung, 2000). 

B. Maintenance and tampering 

The degree to which owners maintain their vehicles by 
providing tune-ups and servicing according to manufacturer 
schedules can affect the likelihood of engine or emissions 
control system failure and therefore tailpipe emissions. 
Outright tampering with vehicles, such as removing fuel tank 
inlet restrictors to permit fueling with leaded fuel that will 
degrade the catalytic converter or tuning engines to improve 
performance, can have a large impact on emissions (Wenzel 
et al, 2000, Bureau of Automotive Repair, 2002; Bin, 2003). 
Early inspection and maintenance (I/M) programs relied on 
visual inspection to discourage tampering. The advent of 
sophisticated on-board computers and sensors has greatly 
reduced the incentive to improve vehicle performance 
through tampering. In fact, tampering with the sophisticated 
electronics installed on today's vehicles will likely reduce 
performance as well as increase emissions. Requirements for 
extended manufacturer warranties have led to vehicle designs 
that are less sensitive to maintenance, at least within the 
warranty period. Nonetheless, there is evidence that 
maintenance can still affect real-world emissions from new 
vehicles, at least on some models (Michalek et al., 2004; Bin, 
2003; Choo et al., 2007). Improper maintenance or repair can 
also lead to higher emissions (Michalek et al., 2004; Bureau 
of Automotive Repair, 2003). The cumulative effects of hard 
driving or "misuse" of a vehicle can also increase emissions. 
For example, prolonged high power driving, such as repeated 
towing of a trailer up mountain grades, leads to high engine 
temperatures that can cause premature damage to the catalytic 
converter, resulting in a dramatic increase in emissions 
(Osborne, 2007). 

Washburn et al, 2001). There are many emissions control 
components that can malfunction or fail. Some of these 
malfunctions are interrelated; for instance, the onboard 
computer of a vehicle with a failed oxygen sensor may 
command a constant fuel enrichment, which can eventually 
lead to catalyst failure. Different component malfunctions 
result in very different emissions consequences. In general, 
malfunctioning vehicles with high CO emissions tend also to 
have high HC emissions, while vehicles with high NOx 
emissions tend to have relatively low CO and HC emissions 
(Wenzel et al., 2000; Bureau of Automotive Repair, 2002). 


II. LOGISTIC REGRESSION MODELS 

Logistic regression models explain the probability of an event 
occurring given certain input variables. The models have 
been used in vehicle emission analysis to explain the 
probability of vehicle emission characteristics when certain 
emission input variables are given. The Radian HEP model 
(Assanis et al., 2003) is a logistic regression model with input 
variables that include vehicle type, model year, catalytic 
converters, odometer readings and type of fuel system. Many 
of the variables have been identified in the literature as being 
correlated with high emitting vehicles (Osborne, 2007; Choo 
et al., 2007). For example, vehicle characteristics such as 
vehicle age (model year), odometer readings (mileage), fuel 
type and fuel system have been identified with higher emissions or 
higher failure rates (Osborne, 2007; Kahn, 1996; Washburn 
et al., 2001; Bin, 2003). Other technology-based relationships 
that have been explored in logistic regression modelling 
include those between the failure rates and repairs of specific 
emissions control system components, such as the catalyst, 
oxygen sensors or exhaust gas recirculation (EGR), and high 
emissions (Prucka et al., 2010; Mohammadia et al., 2007; 
Choo et al., 2007). 

However, the models developed depend on vehicle 
characteristics and emission test variables, and they can only 
be used on vehicles with the same characteristics as the 
vehicles used in model development (Choo et al., 2007). This 
study therefore adopted the approach to develop a logistic 
model to determine which of the engine input variables 
(vehicle usage, compression pressure, ignition angle, engine 
speed and spark plug gap) contributed to vehicles passing or 
failing exhaust emission tests based on KS 1515 standards. 
The identified variables were further used in non-linear 
regression models to explain the effects of the engine 
operating parameters on engine performance and emission 
characteristics. 


III. MATERIALS AND METHODS 

A. Determination of factors that affect exhaust 
emissions 

Vehicle category, vehicle usage and engine operating 
parameters were used as measures of emission levels. Vehicle 
usage was calculated by dividing mileage by age while age 
was calculated from the date of manufacture as indicated in 
the log book and mileage accumulation was obtained directly 
from the odometer. For vehicles whose odometer had stopped 
working, mileage accumulation was calculated from a 
regression model developed by US EPA (US EPA, 2002). 

$$\text{Accumulated use (km)} = 489 \times (\text{Yrs before Kenya}) + 19023 \times (\text{Yrs in Kenya}) - 458.3 \times (\text{Yrs in Kenya})^{2} \qquad (2.1)$$
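
As an illustration only, equation (2.1) and the usage calculation (mileage divided by age) can be sketched in Python. The function names and the example vehicle (3 years abroad, 5 years in Kenya) are hypothetical and are not taken from the study data.

```python
def accumulated_use_km(years_before_kenya: float, years_in_kenya: float) -> float:
    """Estimate accumulated distance (km) from equation (2.1)."""
    return (489 * years_before_kenya
            + 19023 * years_in_kenya
            - 458.3 * years_in_kenya ** 2)


def annual_usage_km_per_year(mileage_km: float, age_years: float) -> float:
    """Vehicle usage = accumulated mileage divided by vehicle age."""
    if age_years <= 0:
        raise ValueError("age_years must be positive")
    return mileage_km / age_years


# Hypothetical vehicle: 3 years before import, then 5 years in Kenya.
mileage = accumulated_use_km(3, 5)
usage = annual_usage_km_per_year(mileage, 3 + 5)
print(round(mileage, 1), round(usage, 1))
```

The quadratic term reduces the estimated yearly distance as the vehicle spends more years in Kenya, which is why the estimate is not simply proportional to age.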

Exhaust emissions were measured using an AGS-200 
exhaust gas analyzer with the engine warm and enrichment 
devices not operating. The engine was required to remain 
idling and was not subjected to any significant electrical 
loading. The exhaust system was checked to be free from any 
leakage. For exhaust gases, the test criterion was based on 
KS 1515-2000 specifications, under which the tested vehicles were 
expected to meet individual gas limits. The limits for CO were 
4.5%, 3.5%, 0.5% and 0.2% for non-catalytic 
vehicles before 1986, non-catalytic vehicles between 1986 
and 2002, catalytic vehicles between 1986 and 2002 and 
catalytic vehicles after 2002 respectively. HC limits were 
1200 ppm for both non-catalytic categories (before 1986, and 
between 1986 and 2002), and 250 ppm and 200 ppm for 
catalytic vehicles between 1986 and 2002 and catalytic 
vehicles after 2002 respectively. The limit for the excess air 
factor lambda (λ), taken as 1.00 ± 0.03, was considered only 
for catalytic vehicles (between 1986 and 2002, and after 
2002). However, the overall pass or fail result was based on 
CO limits. 
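
The limits quoted above can be written as a small lookup, shown here as a sketch rather than the procedure actually used in the study. The category keys and the function name are hypothetical, and treating a reading exactly equal to the limit as a pass is an assumption.

```python
# CO (% vol) and HC (ppm) limits per vehicle category, as quoted in the text.
# The dictionary keys are hypothetical labels chosen for this sketch.
LIMITS = {
    "noncat_pre1986":   {"co": 4.5, "hc": 1200},
    "noncat_1986_2002": {"co": 3.5, "hc": 1200},
    "cat_1986_2002":    {"co": 0.5, "hc": 250},
    "cat_post2002":     {"co": 0.2, "hc": 200},
}
LAMBDA_RANGE = (0.97, 1.03)  # 1.00 +/- 0.03, catalytic vehicles only


def check_vehicle(category, co, hc, lam=None):
    """Compare measured CO, HC and lambda against the category limits."""
    limits = LIMITS[category]
    # Assumption: a reading equal to the limit counts as within the limit.
    result = {"co_ok": co <= limits["co"], "hc_ok": hc <= limits["hc"]}
    if category.startswith("cat_") and lam is not None:
        result["lambda_ok"] = LAMBDA_RANGE[0] <= lam <= LAMBDA_RANGE[1]
    # The overall pass/fail result was based on the CO limit only.
    result["overall"] = "pass" if result["co_ok"] else "fail"
    return result


print(check_vehicle("cat_1986_2002", co=0.3, hc=300, lam=1.05))
```

In this sketch a vehicle can exceed the HC or lambda limit and still be recorded as an overall pass, which mirrors the CO-based criterion stated in the text.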

(i) Non-catalyst test 

A temperature and engine speed probe was connected to the 
engine to obtain the temperature and engine speed readings as 
shown in Fig. 3.3. An exhaust gas analyzer probe was also fitted 
in the exhaust pipe to determine the proportions of carbon 
monoxide (CO), hydrocarbon (HC), carbon dioxide (CO2) 
and air/fuel ratio in the exhaust gas over a period of 5 seconds 
at idle speed. If the vehicle met the CO requirements at its 
normal idling speed but failed the HC limit, the HC levels were 
checked at a high idle speed of 2000 rpm. 

(ii) Catalyst test 

Carbon monoxide (CO), hydrocarbon (HC) and lambda 
were measured at fast idle speed and CO was checked again at idle 
speed. The 1st fast idle test was done by raising the 
engine speed to the vehicle-specific fast idle speed, mostly 
between 2500 and 3000 rpm, and maintaining it for 30 seconds. CO, 
HC and air/fuel ratio values were recorded in the last 5 
seconds as Basic Emission Test (BET) results. If the vehicle 
failed the 1st fast idle test, additional engine pre-conditioning 
was done by running the engine between 2000 and 3000 rpm for 3 
minutes or until all emissions were within limits. After engine 
pre-conditioning, the 2nd fast idle test was done by repeating 
the procedure of the 1st fast idle test. This was followed by 
catalyst stabilization, which required the vehicle-specific fast 
idle speed to be maintained for 30 seconds. Finally, the engine 
was allowed to idle for 30 seconds and, during the last five 
seconds, the CO readings were recorded. 

(iii) Compression pressure 

All the spark plugs were removed from the engine cylinder 
head and the throttle valve was blocked wide open to ensure 
that maximum amount of air enters into the cylinders. Then 
the compressor adaptor was screwed into the spark plug hole 
of cylinder number 1 as shown in Fig. 3.3. To protect the coil 
from high voltage, the primary lead from the negative terminal 
of the coil was disconnected. On electronic systems, the 
positive lead to the control unit was disconnected. The throttle 
was held wide open as the starter motor was operated to crank the 
engine through four compression strokes. The needle 
moved around to indicate the maximum compression in the 
cylinder. The same procedure was repeated for the rest of the 
cylinders. 

(iv) Ignition angle 

Ignition angle was determined by use of a stroboscope as 
shown in Fig. 3.4. The stroboscope lead was connected to the 
number 1 spark plug cable with the engine running at idle 
speed. Each time the number 1 plug fired, the stroboscope 
flashed. This happened so quickly that when the light was 
pointed at the crankshaft pulley, the pulley appeared to stand still. A value 
in degrees corresponding to a mark on the pulley was 
recorded. 

(v) Spark plug gap 

Spark plug gap was measured using a thickness gauge 
(feeler gauge). All spark plugs were removed from the engine 
and their gaps checked and recorded for every vehicle. 

B. Data Analysis 

Data were coded and then entered into Microsoft Excel 
and Statistical Analysis System (SAS) version 9.0 for 
analysis. Data cleaning was done and frequencies were run. 
Cross tabulation was done to look for differences and 
relationship among variables. Descriptive analysis was 
carried out on vehicles characteristics and associated factors 
using the t-test. A Chi-square test was done to determine exhaust 
emission levels at the 5% level of significance. A logistic 
regression model was fitted on the test results and the factors 
associated with them, namely vehicle usage, compression 
pressure, ignition angle, engine speed and spark plug gap. The 
fitted logistic regression model was of the form: 

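$$\ln\left(\frac{\pi}{1-\pi}\right) = \beta_0 + \beta_1 U + \beta_2 P_c + \beta_3 \theta + \beta_4 S + \beta_5 G + \beta_6 \lambda \qquad (2.2)$$

where $\pi = \mathrm{Prob}(Y = y \mid U = x_1, P_c = x_2, \ldots, \lambda = x_6)$ and 

Y = Test results 
U = Vehicle usage 
P_c = Compression pressure 
θ = Ignition angle 
S = Engine speed 
G = Spark plug gap 
λ = Lambda 
β_i = parameter coefficients 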
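
To make the model form concrete, the following sketch inverts the log-odds in equation (2.2) to obtain a probability. The function name and the coefficient and input values are hypothetical placeholders for illustration; they are not the fitted estimates.

```python
import math


def failure_probability(beta, x):
    """Probability implied by a logistic model.

    beta: [b0, b1, ..., b6] (intercept first)
    x:    [U, Pc, theta, S, G, lam] in the same order as b1..b6
    """
    if len(beta) != len(x) + 1:
        raise ValueError("expected one intercept plus one coefficient per predictor")
    eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))  # log-odds
    # Numerically stable inverse logit
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical coefficients and readings, for illustration only.
beta_demo = [-1.0, 0.0001, -0.5, 0.02, 0.001, 1.0, -2.0]
x_demo = [15000, 10.0, 10.0, 800, 0.8, 1.0]
p_fail = failure_probability(beta_demo, x_demo)
print(round(p_fail, 4))
```

A log-odds of zero corresponds to a probability of 0.5, so positive coefficients push the probability above one half as their variable increases and negative coefficients push it below.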

The effectiveness of the fitted logistic model was assessed by 
overall model evaluation, statistical tests on the regression 
and the individual estimated parameters. The statistical test 
for the logistic regression coefficients was implemented using 
the Wald Chi-square. Standard engine performance 
equations were used to analyze engine performance 
characteristics from the parameters, while non-linear 
regression models were used to predict engine performance 
and emission based on engine operating parameters. 
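
For a single coefficient, the Wald Chi-square statistic is conventionally the squared ratio of the estimate to its standard error, referred to a chi-square distribution with one degree of freedom. The sketch below assumes that conventional form; the function name and the numbers in the example are hypothetical and are not taken from the fitted models.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald chi-square statistic (1 df) and its p-value."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    w = (estimate / std_error) ** 2
    # Upper tail of chi-square with 1 df: P(X > w) = erfc(sqrt(w / 2))
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Hypothetical coefficient: estimate 0.5 with standard error 0.25.
w_stat, p_val = wald_test(0.5, 0.25)
print(round(w_stat, 2), round(p_val, 4))
```

With these illustrative numbers the coefficient would be judged significant at the 5% level, since the p-value falls just below 0.05.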

IV. RESULTS AND DISCUSSION 

A. Probability of the engine parameters affecting test 
results 

Equation (2.2) was used to determine the probability of a 
vehicle failing the test criteria given the parameters x1, x2, x3, 
x4, x5 and x6. The hypothesis tested was that the likelihood that 
a vehicle fails the test was related to vehicle 
usage, compression pressure, ignition angle, engine speed and 
spark plug gap. The dependent variable was the test result, 
while vehicle usage, compression pressure, ignition angle, 
engine speed, spark plug gap and lambda were the 
predictors/explanatory variables. The test results were coded 
as 1=Fail and 2=Pass, and the vehicle categories were coded as 
1=Non-catalytic before 1986, 2=Non-catalytic between 1986 
and 2002, 3=Catalytic between 1986 and 2002 and 
4=Catalytic after 2002. The model parameter estimates are 
presented in Table 3.1. Some variables, such as fuel type, body 
type and transmission type, were excluded from the model as 
there was no theoretical justification to include them (Barghi 
and Safavi, 2011). The estimated parameters were used to test 
whether the probability of a vehicle failing emission tests was 
related to vehicle usage, compression pressure, ignition angle, 
engine speed and spark plug gap. From the table, the 
probability values suggest that the coefficients β0, β1, 
β2, β3, β4 and β5, unlike β6, were not statistically 
significant at the 5% significance level for vehicles manufactured 
before 1986, while β1, β2, β3, β5 and β6 were statistically 
significant at 5% for non-catalytic vehicles manufactured 
between 1986 and 2002. Also, β1 and β3 were significant at 
5% for catalytic vehicles manufactured between 1986 and 
2002, while β1 and β2 were significant for catalytic vehicles 
after 2002. 

According to the fitted model, the log of odds of a vehicle 
failing the test is positively related to vehicle usage, ignition 
angle and spark plug gap and negatively related to 
compression pressure and lambda for non-catalytic vehicles 
manufactured between 1986 and 2002, while it is positively 
related to vehicle usage and ignition angle for catalytic 
vehicles between 1986 and 2002. Also, the odds of vehicles 
failing were negatively related to vehicle usage and 
compression pressure for catalytic vehicles after 2002. In 
other words, the bigger the values of the positive variables, 
the higher the chances of the vehicle failing the test, while the 
smaller the values of the negative variables, the higher the 
chances of the vehicle failing the test. Overall, vehicle usage 
and compression pressure influenced test results more than 
the other parameters. Lambda also influenced the test results 
for non-catalytic vehicles. This is because vehicle usage is a 
function of normal degradation of emission controls of 
properly functioning vehicles, resulting in a moderate emission 
increase (Gaeta et al., 2011). Compression pressure and fuel 
metering can be affected by lack of proper inspection and 
maintenance, which may result in long intervals between vehicle 
services. Long service intervals affect vehicle lubricant 
properties, resulting in increased wear, which affects 
compression pressure and fuel metering (Ebrahimi et al., 
2012; Bin, 2003). 
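
The sign interpretation above can be illustrated by converting a coefficient into an odds ratio, exp(β × Δx). The sketch uses two estimates read from Table 3.1 for non-catalytic vehicles between 1986 and 2002 (-1.7190 and 18.8360), assuming these rows correspond to compression pressure and spark plug gap respectively. The row labels in the table are partly illegible, and the units (bar and mm) and the 0.1 step are assumptions.

```python
import math


def odds_ratio(beta, delta=1.0):
    """Multiplicative change in the odds of failing for a change `delta` in x."""
    return math.exp(beta * delta)


# Assumed to be the compression pressure coefficient (per bar).
or_pressure = odds_ratio(-1.7190)
# Assumed to be the spark plug gap coefficient, evaluated for a 0.1 unit change.
or_gap = odds_ratio(18.8360, delta=0.1)
print(round(or_pressure, 3), round(or_gap, 2))
```

An odds ratio below 1 means the odds of failing shrink as the variable increases, and a ratio above 1 means they grow, which is the numerical counterpart of the negative and positive relationships described in the text.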


Table 3.1: Parameter estimates for the logistic regression model 

Vehicle category             Parameter   Estimate     Std Error   P-value
Non-catalytic before 1986    β0          464.C        223.1       0.9550
                             β1          -0.00112     0.00057     0.8292
                             β2          -15.7825     4.6535      0.7238
                             β3          -0.8071      4.2222      0.9734
                             β4          -0.0706      0.0074      0.9243
                             β5          34.6242      13.1        0.9462
                             β6          -16.5329     4.5352      0.0013
Non-catalytic between 1986   β0          -57.5845     41.6393     0.1667
and 2002                     β1          0.000037     0.000021    0.0777
                             β2          -1.7190      0.0328      0.0001
                             β3          0.00956      0.00049     0.0535
                             β4          0.1411       0.0111      0.2057
                             β5          18.8360      4.2141      0.0001
                             β6          -4.2119      1.3919      0.0025
Catalytic between 1986       β0          523.0        267.5       0.1547
and 2002                     β1          0.000207     0.000077    0.0073
                             β2          -1.3152      0.4773      0.3733
                             β3          0.0372       0.0022      0.0928
                             β4          1.4733       0.09831     0.1340
                             β5          -28.1704     5.0589      0.2797
                             β6          2:.9545      1.634       0.6511
Catalytic after 2002         β0          17.9671      2.073       0.9307
                             β1          -0.00018     0.00010     0.0733
                             β2          -10.1167     4.4870      0.0242
                             β3          0.2216       0.0542      0.6827
                             β4          0.0553       0.0078      0.1818
                             β5          -20.4581     2.6836      0.1068
                             β6          -27.0092     0.5935      0.4411


B. Evaluation of the fitted logistic models 

The model effectiveness was assessed by overall model 
evaluation, statistical tests on the regression and the 
individual estimated parameters. The overall model evaluation was 
performed by comparing the null model (intercept-only 
model) with the fitted logistic regression model. The null 
model provides a baseline because it contains no predictors. A 
logistic model is said to be a better fit if its diagnostics are 
smaller than those of the intercept-only model. Consequently, 
the fitted logistic model has a better fit than the null model. 
This is shown by the Akaike Information Criterion (AIC), 
likelihood ratio and Schwarz Criterion (SC) tests, all of 
which yield a similar conclusion. In all the cases, the fitted logistic 
model minimized the AIC and SC while maximizing the 
likelihood ratio relative to the null model. The tests are 
presented in Table 3.2. 
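
The two information criteria are conventionally computed from -2 Log L as AIC = -2 Log L + 2k and SC = -2 Log L + k ln(n), where k is the number of estimated parameters and n the number of observations. The sketch below applies these standard formulas to the -2 Log L values reported for non-catalytic vehicles before 1986. The values k = 6 for the fitted model and n = 33 are assumptions inferred from the tabulated AIC and SC, not figures stated in the text.

```python
import math


def aic(neg2_log_l, k):
    """Akaike Information Criterion from -2 Log L and parameter count k."""
    return neg2_log_l + 2 * k


def sc(neg2_log_l, k, n):
    """Schwarz Criterion from -2 Log L, parameter count k and sample size n."""
    return neg2_log_l + k * math.log(n)


n_obs = 33  # assumption: implied by the tabulated SC values, not stated in the text

# Intercept-only model (k = 1) and fitted model (k = 6, assumed)
null_aic, null_sc = aic(15.090, 1), sc(15.090, 1, n_obs)
fit_aic, fit_sc = aic(0.017, 6), sc(0.017, 6, n_obs)
print(round(null_aic, 3), round(null_sc, 3))
print(round(fit_aic, 3), round(fit_sc, 3))
```

Both criteria add a penalty for each extra parameter, and the SC penalty grows with the sample size, so the two need not move in the same direction when predictors are added to a small sample.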

Table 3.2: Overall model evaluation 

Vehicle category             Criterion   Intercept only   Intercept and covariates
Non-catalytic before 1986    AIC         17.090           12.017
                             SC          18.586           20.996
                             -2 Log L    15.090           0.017
Non-catalytic 1986-2002      AIC         230.804          121.547
                             SC          234.067          141.124
                             -2 Log L    228.804          109.547
Catalytic 1986-2002          AIC         164.169          55.529
                             SC          166.997          72.499
                             -2 Log L    162.169          43.529
Catalytic after 2002         AIC         47.475           38.613
                             SC          48.971           47.598
                             -2 Log L    45.475           26.613

The statistical test for the regression coefficients was 
implemented using the Wald Chi-square statistic for the three 
criteria. The results are presented in Table 3.3. 



Table 3.3: Wald Chi-square table 

Vehicle category             Test               Chi-Square   df   P-value
Non-catalytic before 1986    Likelihood Ratio   15.0723      5    0.0101
                             Score              11.3850      5    0.0443
                             Wald               0.30250      5    0.9976
Non-catalytic 1986-2002      Likelihood Ratio   119.2569     5    0.0001
                             Score              93.9806      5    0.0001
                             Wald               45.1628      5    0.0001
Catalytic 1986-2002          Likelihood Ratio   118.6399     5    0.0001
                             Score              83.0424      5    0.0001
                             Wald               18.9953      5    0.0019
Catalytic after 2002         Likelihood Ratio   18.8611      5    0.0020
                             Score              14.5865      5    0.0123
                             Wald               8.5413       5    0.1288



The null hypothesis of no significant regression was strongly 
rejected for both categories of vehicles manufactured between 
1986 and 2002 by the three tests at the 5% significance level. For 
non-catalytic vehicles before 1986 and catalytic vehicles after 
2002, the regression model was also considered to be 
significant, although the Wald statistic failed to 
reject the hypothesis of no significance in regression. 
Overall, the combination of independent variables (vehicle 
usage, compression pressure, ignition angle, idle speed and 
spark plug gap) significantly contributed to the probability of 
failure or pass for the vehicles studied. 
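
As a check on how the likelihood ratio rows relate to the -2 Log L values, the sketch below computes the likelihood ratio statistic as the difference in -2 Log L between the intercept-only and fitted models, and its p-value from a chi-square distribution with 5 degrees of freedom. The closed-form tail probability is the standard expression for 5 df; the function names are hypothetical.

```python
import math


def chi2_sf_df5(x):
    """Upper-tail probability of a chi-square variable with 5 df (closed form)."""
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0)
            * (math.sqrt(x) + x ** 1.5 / 3.0))


def likelihood_ratio(neg2ll_null, neg2ll_fitted):
    """Likelihood ratio chi-square: drop in -2 Log L from null to fitted model."""
    return neg2ll_null - neg2ll_fitted


# -2 Log L values as tabulated for two of the vehicle categories
lr_pre1986 = likelihood_ratio(15.090, 0.017)
lr_post2002 = likelihood_ratio(45.475, 26.613)
p_pre1986 = chi2_sf_df5(lr_pre1986)
p_post2002 = chi2_sf_df5(lr_post2002)
print(round(lr_pre1986, 3), round(p_pre1986, 4))
print(round(lr_post2002, 3), round(p_post2002, 4))
```

The resulting statistics and p-values agree with the tabulated likelihood ratio rows for these two categories to within rounding.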

V. CONCLUSION 

The logistic regression model showed that the coefficients of 
the engine parameters, namely vehicle usage, compression 
pressure, ignition angle, engine speed, spark plug gap and 
lambda, were statistically significant in contributing to the 
probability of a vehicle failing or passing the test. The assessment 
of the logistic model showed a better fit, as the fitted model 
minimized the AIC and SC while maximizing the likelihood 
ratio relative to the null model. The null hypothesis of no 
significant regression was strongly rejected for all categories 
of vehicles at the 5% significance level. 